65 research outputs found
Learning and Testing Variable Partitions
Let be a multivariate function from a product set to an
Abelian group . A -partition of with cost is a partition of
the set of variables into non-empty subsets such that is -close to
for some with
respect to a given error metric. We study algorithms for agnostically learning
partitions and testing -partitionability over various groups and error
metrics given query access to . In particular we show that
Given a function that has a -partition of cost , a partition
of cost can be learned in time
for any .
In contrast, for and learning a partition of cost is NP-hard.
When is real-valued and the error metric is the 2-norm, a
2-partition of cost can be learned in time
.
When is -valued and the error metric is Hamming
weight, -partitionability is testable with one-sided error and
non-adaptive queries. We also show that even
two-sided testers require queries when .
This work was motivated by reinforcement learning control tasks in which the
set of control variables can be partitioned. The partitioning reduces the task
into multiple lower-dimensional ones that are relatively easier to learn. Our
second algorithm empirically increases the scores attained over previous
heuristic partitioning methods applied in this context.
Comment: Innovations in Theoretical Computer Science (ITCS) 202
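To make the notion of partitionability concrete, here is a minimal sketch of a randomized check for whether a function splits as a sum over a given 2-partition of its variables. It is not the tester from the paper. It assumes binary inputs and values taken modulo `q`, and the names `mixed_difference` and `looks_separable` are hypothetical. The check relies on the standard identity that if f(V) = F1(A) + F2(B), then f(x) + f(y) = f(x_A, y_B) + f(y_A, x_B) for all inputs x and y.

```python
import random

def mixed_difference(f, x, y, part):
    """f(x) + f(y) - f(x_A, y_B) - f(y_A, x_B), where A = part and B is the rest."""
    n = len(x)
    u = tuple(x[i] if i in part else y[i] for i in range(n))  # x on A, y on B
    v = tuple(y[i] if i in part else x[i] for i in range(n))  # y on A, x on B
    return f(x) + f(y) - f(u) - f(v)

def looks_separable(f, n, part, q, trials=200, seed=0):
    """Sample random input pairs; reject as soon as the identity fails mod q.

    A function that truly splits over (part, complement) is never rejected,
    so the check has one-sided error.
    """
    rng = random.Random(seed)
    part = set(part)
    for _ in range(trials):
        x = tuple(rng.randrange(2) for _ in range(n))
        y = tuple(rng.randrange(2) for _ in range(n))
        if mixed_difference(f, x, y, part) % q != 0:
            return False
    return True

# Illustrative function: x0*x1 depends only on {0, 1}, x2*x3 only on {2, 3}.
f = lambda x: x[0] * x[1] + x[2] * x[3]
print(looks_separable(f, 4, {0, 1}, q=5))
print(looks_separable(f, 4, {0, 2}, q=5))
```

The sketch only tests one candidate partition; learning a good partition, as the abstract describes, is a separate and harder task.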
Average-Case Complexity
We survey the average-case complexity of problems in NP.
We discuss various notions of good-on-average algorithms, and present
completeness results due to Impagliazzo and Levin. Such completeness results
establish the fact that if a certain specific (but somewhat artificial) NP
problem is easy-on-average with respect to the uniform distribution, then all
problems in NP are easy-on-average with respect to all samplable distributions.
Applying the theory to natural distributional problems remains an outstanding
open question. We review some natural distributional problems whose
average-case complexity is of particular interest and that do not yet fit into
this theory.
A major open question is whether the existence of hard-on-average problems in NP
can be based on the P ≠ NP assumption or on related worst-case assumptions.
We review negative results showing that certain proof techniques cannot prove
such a result. While the relation between worst-case and average-case
complexity for general NP problems remains open, there has been progress in
understanding the relation between different ``degrees'' of average-case
complexity. We discuss some of these ``hardness amplification'' results.
The Computational Complexity of Estimating Convergence Time
An important problem in the implementation of Markov Chain Monte Carlo
algorithms is to determine the convergence time, or the number of iterations
before the chain is close to stationarity. For many Markov chains used in
practice this time is not known. Even in cases where the convergence time is
known to be polynomial, the theoretical bounds are often too crude to be
practical. Thus, practitioners like to carry out some form of statistical
analysis in order to assess convergence. This has led to the development of a
number of methods known as convergence diagnostics which attempt to diagnose
whether the Markov chain is far from stationarity. We study the problem of
testing convergence in the following settings and prove that the problem is
hard in a computational sense. First, given a Markov chain that mixes rapidly, it is
hard for Statistical Zero Knowledge (SZK-hard) to distinguish whether starting
from a given state, the chain is close to stationarity by time t or far from
stationarity at time ct for a constant c. We show the problem is in AM
intersect coAM. Second, given a Markov chain that mixes rapidly it is coNP-hard
to distinguish whether it is close to stationarity by time t or far from
stationarity at time ct for a constant c. The problem is in coAM. Finally, it
is PSPACE-complete to distinguish whether the Markov chain is close to
stationarity by time t or far from being mixed at time ct for c at least 1.
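The quantity being tested above, distance from stationarity at time t from a given start state, can be computed directly when the chain is small enough to write down. The following is a minimal sketch under that assumption: the chain is given as an explicit row-stochastic matrix `P` with known stationary distribution `pi`, and closeness is measured in total variation distance. The function names are hypothetical. This brute-force computation scales with the size of the state space, so it says nothing about the hardness results in the abstract.

```python
def step(dist, P):
    """Advance a distribution over states by one step of the chain."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def tv_distance(p, q):
    """Total variation distance between two distributions on the same states."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def tv_from_state(P, pi, start, t):
    """Distance to stationarity after t steps, starting from state `start`."""
    dist = [0.0] * len(P)
    dist[start] = 1.0
    for _ in range(t):
        dist = step(dist, P)
    return tv_distance(dist, pi)

# Two-state chain that stays put with probability 0.9.
P = [[0.9, 0.1], [0.1, 0.9]]
pi = [0.5, 0.5]
for t in (0, 1, 5, 20):
    print(t, tv_from_state(P, pi, 0, t))
```

For this two-state example the distance is 0.5 * 0.8^t, which makes it easy to compare the value at time t with the value at time ct.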
Small Bias Requires Large Formulas
A small-biased function is a randomized function whose distribution of truth-tables is small-biased. We demonstrate that known explicit lower bounds on (1) the size of general Boolean formulas, (2) the size of De Morgan formulas, and (3) correlation against small De Morgan formulas apply to small-biased functions. As a consequence, any strongly explicit small-biased generator is subject to the best-known explicit formula lower bounds in all these models.
On the other hand, we give a construction of a small-biased function that is tight with respect to lower bound (1) for the relevant range of parameters. We interpret this construction as a natural-type barrier against substantially stronger lower bounds for general formulas.
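As an illustration of the small-bias condition, the sketch below computes the bias of an explicitly given distribution over n-bit strings: the largest absolute correlation with any nonzero parity test. In the setting of the abstract the strings would be truth tables of the randomized function. Here, as an assumption made for illustration, the distribution is a small dictionary and the function name `bias` is hypothetical. The enumeration takes time exponential in n and is meant only to show the definition.

```python
import itertools

def bias(dist, n):
    """Max over nonzero a in {0,1}^n of |E[(-1)^<a, x>]| for x drawn from dist.

    dist maps n-bit tuples to probabilities.
    """
    worst = 0.0
    for a in itertools.product((0, 1), repeat=n):
        if not any(a):
            continue  # the all-zero test is trivial and is excluded
        corr = sum(
            p * (-1) ** (sum(ai * xi for ai, xi in zip(a, x)) % 2)
            for x, p in dist.items()
        )
        worst = max(worst, abs(corr))
    return worst

# The uniform distribution has bias 0; a distribution supported on
# even-parity strings is fully detected by the all-ones parity test.
uniform = {x: 1 / 8 for x in itertools.product((0, 1), repeat=3)}
even_only = {(0, 0): 0.5, (1, 1): 0.5}
print(bias(uniform, 3), bias(even_only, 2))
```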
Approximate Degree, Secret Sharing, and Concentration Phenomena
The epsilon-approximate degree deg~_epsilon(f) of a Boolean function f is the least degree of a real-valued polynomial that approximates f pointwise to within epsilon. A sound and complete certificate for approximate degree being at least k is a pair of probability distributions, also known as a dual polynomial, that are perfectly k-wise indistinguishable, but are distinguishable by f with advantage 1 - epsilon. Our contributions are:
- We give a simple, explicit new construction of a dual polynomial for the AND function on n bits, certifying that its epsilon-approximate degree is Omega (sqrt{n log 1/epsilon}). This construction is the first to extend to the notion of weighted degree, and yields the first explicit certificate that the 1/3-approximate degree of any (possibly unbalanced) read-once DNF is Omega(sqrt{n}). It draws a novel connection between the approximate degree of AND and anti-concentration of the Binomial distribution.
- We show that any pair of symmetric distributions on n-bit strings that are perfectly k-wise indistinguishable are also statistically K-wise indistinguishable with at most K^{3/2} * exp (-Omega (k^2/K)) error for all k < K <= n/64. This bound is essentially tight, and implies that any symmetric function f is a reconstruction function with constant advantage for a ramp secret sharing scheme that is secure against size-K coalitions with statistical error K^{3/2} * exp (-Omega (deg~_{1/3}(f)^2/K)) for all values of K up to n/64 simultaneously. Previous secret sharing schemes required that K be determined in advance, and only worked for f=AND. Our analysis draws another new connection between approximate degree and concentration phenomena.
As a corollary of this result, we show that for any d deg~_{1/3}(f). These upper and lower bounds were also previously only known in the case f=AND.
- …
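The notion of perfect k-wise indistinguishability used in the contributions above can be checked by brute force on small examples. The sketch below is an illustration, not a construction from the paper: it compares the marginals of two explicitly given distributions on every set of k coordinates, and uses the textbook pair of uniform distributions over even-parity and odd-parity strings as a test case. The names `marginal`, `k_wise_indistinguishable`, and `parity_pair` are hypothetical.

```python
import itertools
from collections import defaultdict

def marginal(dist, coords):
    """Project a distribution over bit tuples onto the given coordinates."""
    out = defaultdict(float)
    for x, p in dist.items():
        out[tuple(x[i] for i in coords)] += p
    return dict(out)

def k_wise_indistinguishable(mu, nu, n, k, tol=1e-12):
    """True if mu and nu agree on every k-coordinate marginal.

    Agreement on all size-k marginals implies agreement on all smaller ones.
    """
    for coords in itertools.combinations(range(n), k):
        a, b = marginal(mu, coords), marginal(nu, coords)
        for key in set(a) | set(b):
            if abs(a.get(key, 0.0) - b.get(key, 0.0)) > tol:
                return False
    return True

def parity_pair(n):
    """Uniform distributions over even-parity and odd-parity n-bit strings."""
    strings = list(itertools.product((0, 1), repeat=n))
    even = [x for x in strings if sum(x) % 2 == 0]
    odd = [x for x in strings if sum(x) % 2 == 1]
    return ({x: 1 / len(even) for x in even}, {x: 1 / len(odd) for x in odd})

mu, nu = parity_pair(4)
print(k_wise_indistinguishable(mu, nu, 4, 3))
print(k_wise_indistinguishable(mu, nu, 4, 4))
```

The parity pair is indistinguishable on any n - 1 coordinates yet perfectly distinguished by the parity function, which is the kind of gap a dual polynomial certifies.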